The genetic algorithm (GA) is a machine-based optimization
routine which connects evolutionary learning to natural genetic laws 
[1-2]. The present work addresses the problem of obtaining the 
dominant takeover regimes in the GA dynamics. Estimated GA run 
times are computed for slow and fast convergence in the limits of 
high and low fitness ratios. Using Euler’s device for obtaining partial 
sums in closed forms, the result relaxes the previously held 
requirements for long time limits. Analytical solutions reveal that 
appropriately accelerated regimes can mark the ascendancy of the 
most fit solution. In virtually all cases, the weak (logarithmic)
dependence of convergence time on problem size demonstrates the
potential for the GA to solve large NP-complete problems.


A central issue in how the GA processes strings (solution-encoded
genomes) is the takeover time for the most fit genome [2]. Takeover
time here refers to how the computational time or complexity scales
with larger problem sizes. Short takeover times correspond to rapid
convergence onto the most fit individual. Conversely, long takeover
times correspond to slow convergence and sluggish dynamics. A
historical discussion of computational complexity appears in
reference 3.

Here we solve the standard GA models for generational reproduction
[4], evaluate the takeover times, and compare the results using
various approximation techniques. The approximations take place
only in time and fitness space, such that the exact results can be
compared directly. The aim is to identify and understand why the GA
sorts through some regions of the fitness landscape so efficiently but
gets bogged down in other regions.

For discrete time steps between generations, $t = \{0, 1, 2, \ldots\}$,
let $P_{i,t}$ correspond to the proportion of alleles (or bit string
values) set to the value 1 for a particular allele position $i$ at
generation $t$. Let $P_{i,0}$ represent $P$ for the founder generation,
$t = 0$. For simplicity, all subsequent work will treat binary genetic
algorithms, whose alleles possess either of two values, 0 or 1; the
multivalue case is a trivial generalization at this level of proof. Let
$f_1$ correspond to the organism’s fitness (some survival probability)
sampled with allele value 1 in a particular position $j$. Likewise, take
$f_0$ to represent the fitness of all organisms sampled with allele
value 0 in position $j$. For any defined fitness ratio, $r = f_1/f_0$,
the value will be considered time-independent and constant across
generations.

Thus to begin, assume generational reproduction on a binary fitness
landscape $(f_0, f_1)$ with fitness ratio $r = f_1/f_0$. Previous
results have established an equivalence condition to match
generational and steady-state reproduction, so we consider the
following derivations to develop with some parallel applicability for
steady-state GAs (e.g. Genitor). In the binary GA [5], generational
reproduction implies that

$$P_{t+1} = \frac{f_1}{S_t} P_t \tag{1}$$

where $S_t$ defines the average population fitness,
$S_t = P_t f_1 + (1 - P_t) f_0$.


An iterated recursion relation is complicated by the appearance of
$P_t$ in the average population fitness, $S_t$. Here
$S_t = P_t f_1 + (1 - P_t) f_0$ is the total average population
fitness. Physically the last term represents the copying of one
individual with a reproductive rate, $f_1/S_t$.
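
The recursion (1) is straightforward to iterate numerically. A minimal Python sketch follows; the function names and the sample values ($r = 2$, $P_0 = 0.01$) are illustrative assumptions and do not come from the original work.

```python
def next_proportion(p, f1, f0):
    """One generational step of Eq. (1): P_{t+1} = (f1 / S_t) * P_t."""
    s = p * f1 + (1.0 - p) * f0  # average population fitness S_t
    return (f1 / s) * p


def iterate_population(p0, f1, f0, generations):
    """Return [P_0, P_1, ..., P_generations] under repeated use of Eq. (1)."""
    history = [p0]
    for _ in range(generations):
        history.append(next_proportion(history[-1], f1, f0))
    return history


if __name__ == "__main__":
    # Hypothetical values: fitness ratio r = f1/f0 = 2, favoured allele at 1%.
    for t, p in enumerate(iterate_population(0.01, 2.0, 1.0, 10)):
        print(t, round(p, 4))
```

Because $S_t$ depends on $P_t$, each step has to be evaluated in sequence, which is the complication the closed forms in the following sections are meant to remove.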


The recursion relation governing inverse population growth
($X = 1/P$) is [5]:

$$X_t = 1 - \frac{1}{r} + \frac{1}{r} X_{t-1} = A + B X_{t-1} \tag{2}$$
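
The step from (1) to (2) can be made explicit. The short derivation below uses only the definitions of $S_t$ and $r$ given above and identifies the constants as $A = 1 - 1/r$ and $B = 1/r$; it is offered as a reading aid and is not reproduced from the original text.

```latex
\begin{align*}
X_t = \frac{1}{P_t}
  &= \frac{S_{t-1}}{f_1 P_{t-1}}
   = \frac{P_{t-1} f_1 + (1 - P_{t-1}) f_0}{f_1 P_{t-1}} \\
  &= 1 + \frac{1}{r}\left(\frac{1}{P_{t-1}} - 1\right)
   = \left(1 - \frac{1}{r}\right) + \frac{1}{r}\, X_{t-1}
\end{align*}
```

The inverse population therefore obeys a linear recursion even though the population itself does not.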

Iterating and solving as a function of the fitness ratio and time gives:

$$X_t = \frac{X_0}{r^t} + \left(1 - \frac{1}{r}\right) \sum_{n=0}^{t-1} \left(\frac{1}{r}\right)^n \tag{3}$$

Equation (3) can be inverted to solve for population dynamics, $P_t$:

$$P_t = \frac{P_0 r^t}{1 + P_0 r^{t-1} (r - 1)\, \Gamma_t} \tag{4}$$

where the term $\Gamma_t$ is the geometric series

$$\Gamma_t = \sum_{n=0}^{t-1} \left(\frac{1}{r}\right)^n \tag{5}$$

The aim of this analysis is to solve for closed form results in the
limits of fitness space for long and short times. The approximate limits
will parameterize the GA convergence for arbitrary fitness values.


The Case of Long Times ($t \to \infty$) and Fitness Ratios $r > 1$


For an infinite series summed with large fitness ratios, i.e. for
$(1/r) < 1$, the series (5) converges according to

$$\sum_{n=0}^{\infty} x^n = 1 + x + x^2 + \cdots = (1 - x)^{-1} \tag{6}$$

$$\Gamma_t = \left(1 - \frac{1}{r}\right)^{-1} = \frac{r}{r - 1} \tag{7}$$


Thus, in the limit of long times and fitness ratios greater than unity,
the population changes according to the dynamical equation:

$$P_t = \frac{P_0 r^t}{1 + P_0 r^t} \tag{8}$$

Equation (8) can be solved for how the computational complexity
varies with the population size, $n$. Two cases are examined, the worst
and average complexities, which in turn depend on the expected
frequency of the final solution appearing in the initial population,
$P_0$. In the average case, the initial population has a random
frequency for the best solution, thus $P_0 = 0.5$. Alternatively, in the
worst case, the initial population has a minimum frequency for the best
solution, thus $P_0 = 1/n$. Near convergence the final population
approaches unity for all cases, thus $P_f = 1 - (1/n) = (n-1)/n$. With
this dependence on population size, it is possible to solve directly
for the takeover time (or computational complexity) of the most fit
member:


$$t_c = \frac{\ln\left[P_f / \left(P_0 (1 - P_f)\right)\right]}{\ln(r)} \tag{9}$$


For worst case convergence times, then

$$t_c = \frac{\ln[n(n-1)]}{\ln(r)} \sim O[\ln(n)] \tag{10}$$

Similarly for average convergence times, then

$$t_c = \frac{\ln[2(n-1)]}{\ln(r)} \sim O[\ln(n)] \tag{11}$$


Both cases for takeover times (10-11) demonstrate the same weak 
logarithmic dependence on population size. 
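
The logarithmic scaling in (10) and (11) can be tabulated directly. The sketch below assumes the long-time, $r > 1$ limit of this section; the function names and the choice $r = 2$ are hypothetical and serve only as an illustration.

```python
import math


def takeover_time(p0, pf, r):
    """Eq. (9): t = ln[P_f / (P_0 (1 - P_f))] / ln(r), for r > 1."""
    return math.log(pf / (p0 * (1.0 - pf))) / math.log(r)


def worst_case(n, r):
    """Eq. (10): P_0 = 1/n and P_f = (n - 1)/n."""
    return math.log(n * (n - 1)) / math.log(r)


def average_case(n, r):
    """Eq. (11): P_0 = 0.5 and P_f = (n - 1)/n."""
    return math.log(2 * (n - 1)) / math.log(r)


if __name__ == "__main__":
    r = 2.0  # hypothetical fitness ratio
    for n in (10, 100, 1000, 10000):
        print(n, round(worst_case(n, r), 2), round(average_case(n, r), 2))
```

Raising $n$ from 100 to 10000 roughly doubles the worst case takeover time, as expected for $O[\ln(n)]$ growth.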


The Case of Both Arbitrary Times and Fitness Ratios 

For a finite series summed with arbitrary fitness ratios, the series
(5) converges according to

$$\sum_{n=0}^{t-1} x^n = \frac{1 - x^t}{1 - x} \tag{12}$$

$$\Gamma_t = \frac{1 - (1/r)^t}{1 - (1/r)} = \frac{1}{r^{t-1}} \, \frac{r^t - 1}{r - 1} \tag{13}$$


Thus, for finite times and arbitrary fitness ratios, the population
changes according to the dynamical equation:

$$P_t = \frac{P_0 r^t}{1 + P_0 (r^t - 1)} = \frac{P_0 r^t}{(1 - P_0) + P_0 r^t} \tag{14}$$

This agrees with the result derived independently by Deb and
Goldberg [6] and Ankenbrandt [4].
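
As a numerical cross-check, the closed form (14) can be compared against direct iteration of the recursion (1). The sketch below takes $f_0 = 1$ and $f_1 = r$ without loss of generality; the function names and test values are assumptions made for illustration.

```python
def closed_form(p0, r, t):
    """Eq. (14): P_t = P_0 r^t / ((1 - P_0) + P_0 r^t)."""
    growth = r ** t
    return p0 * growth / ((1.0 - p0) + p0 * growth)


def iterate(p0, r, t):
    """Apply Eq. (1) t times with f_1 = r and f_0 = 1."""
    p = p0
    for _ in range(t):
        p = r * p / (p * r + (1.0 - p))
    return p


if __name__ == "__main__":
    # Hypothetical parameters: r = 1.5, P_0 = 0.02.
    for t in (0, 1, 5, 10, 20):
        print(t, round(closed_form(0.02, 1.5, t), 6), round(iterate(0.02, 1.5, t), 6))
```

The two columns agree to rounding error for every $t$, not only in the long-time limit, which is the sense in which (14) is exact.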


Equation (14) can be solved for the computational complexity in terms
of the population size, $n$, and fitness ratio, $r$:

$$t_c = \frac{\ln\left[\dfrac{P_f (1 - P_0)}{P_0 (1 - P_f)}\right]}{\ln(r)} \tag{15}$$

Again, two cases are examined, the worst and average complexities.
For worst case convergence times, then

$$t_c = \frac{2 \ln(n-1)}{\ln(r)} \sim O[\ln(n)] \tag{16}$$

Similarly for average convergence times, then

$$t_c = \frac{\ln(n-1)}{\ln(r)} \sim O[\ln(n)] \tag{17}$$

Both cases for takeover times demonstrate the same weak logarithmic
dependence on population size.
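
The takeover time that follows from (14) can also be compared with a brute-force count of generations. The sketch below assumes $r > 1$ and $0 < P_0 < P_f < 1$ (otherwise the loop would not terminate); the function names and sample values are hypothetical.

```python
import math


def takeover_time_exact(p0, pf, r):
    """Solve Eq. (14) for t: ln[P_f (1 - P_0) / (P_0 (1 - P_f))] / ln(r)."""
    return math.log(pf * (1.0 - p0) / (p0 * (1.0 - pf))) / math.log(r)


def generations_to_reach(p0, pf, r):
    """Count applications of Eq. (1) (f_1 = r, f_0 = 1) until P_t >= P_f."""
    p, t = p0, 0
    while p < pf:
        p = r * p / (p * r + (1.0 - p))
        t += 1
    return t


if __name__ == "__main__":
    n, r = 100, 2.0  # hypothetical population size and fitness ratio
    p0, pf = 1.0 / n, (n - 1.0) / n  # worst case starting point
    print(takeover_time_exact(p0, pf, r), generations_to_reach(p0, pf, r))
```

For these values the analytic time is a little over 13 generations, and the discrete count is the next whole generation.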

The Case of Long Times ($t \to \infty$) and Low Fitness Ratios ($r < 1$)


For worst case convergence times in this low fitness limit, then

$$t_c = \frac{\ln\left[(n-1)^2 / n\right]}{\ln(r)} \tag{21}$$

For large populations, the logarithmic numerator goes to the limit
$\ln[n - 2 + (1/n)] \approx \ln(n - 2)$. For small populations,
$n \to 0$, the logarithmic numerator goes to the limit
$\ln[n - 2 + (1/n)] \approx \ln[(1/n) - 2]$.


Similarly for average convergence times, then

$$t_c = \frac{\ln[(n-1)/n]}{\ln(r)} \tag{22}$$

For large populations, $n \to \infty$, the logarithmic numerator goes to
the limit $\ln[1 - (1/n)] = 0$.


The Case of Finite Times and Arbitrary Fitness Ratios $r$


For a finite series summed with arbitrary fitness ratios, the series
(5) can be summed using an ingenious tool sometimes called Euler’s
device [7]. For this case, the series is rewritten for $z = -1/r$ and
the geometric series can be summed for any intermediate time (in
powers of $p = 2^q$, $q = 1, 2, 3, \ldots$):

$$\sum_{n=0}^{t} \left(\frac{1}{r}\right)^n = 1 - z + z^2 - z^3 + \cdots \tag{23}$$

$$\sum_{n=0}^{t} (-z)^n = 1 - z + z^2 - z^3 + \cdots \tag{24}$$

To evaluate any intermediate sum, $\Gamma_{2p-1}$, Euler’s device gives:

$$\Gamma_1 = (1 - z) = 1 + 1/r, \qquad \Gamma_{2p-1} = (1 + z^p)\,\Gamma_{p-1} = \left[1 + (1/r)^p\right] \Gamma_{p-1} \tag{25}$$

This case represents the main exposition of this paper. It enables one 
to inspect the GA dynamics in intermediate time intervals without a 
lack of analytical generality. For any intermediate generation, the 
series can be summed in terms of previously known terms. As an 
example, consider the series up to generation 15: 

$$\Gamma_3 = (1 + 1/r^2)\,\Gamma_1; \qquad \Gamma_7 = (1 + 1/r^4)\,\Gamma_3 = 1 + 1/r + 1/r^2 + \cdots + 1/r^7; \qquad \Gamma_{15} = (1 + 1/r^8)\,\Gamma_7 \tag{26}$$

Thus, in the limit of any finite time and arbitrary fitness ratios, the
population changes according to the dynamical equation:

$$P_{2p-1} = \frac{P_0 r^{2p-1}}{1 + P_0 \left(1 + 1/r^p\right) \Gamma_{p-1}\, r^{2p-2} (r - 1)} \tag{27}$$

Fig. 1 shows the stepped amplification for discrete iterations using 
Euler’s device on the series sum and the population. 
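
The doubling rule of Euler’s device can be checked against a term-by-term sum. The sketch below follows the indexing of the worked example, in which $\Gamma_k$ runs from $n = 0$ to $n = k$ (so $\Gamma_1 = 1 + 1/r$); the function names and the value $r = 1.5$ are assumptions chosen for illustration.

```python
def gamma_direct(k, r):
    """Partial sum of (1/r)^n for n = 0..k, summed term by term."""
    return sum((1.0 / r) ** n for n in range(k + 1))


def gamma_doubling(q, r):
    """Gamma_{2^(q+1) - 1} built from Gamma_1 by q doubling steps."""
    gamma = 1.0 + 1.0 / r  # Gamma_1
    p = 2
    for _ in range(q):
        # Gamma_{2p-1} = [1 + (1/r)^p] * Gamma_{p-1}
        gamma *= 1.0 + (1.0 / r) ** p
        p *= 2
    return gamma


if __name__ == "__main__":
    r = 1.5  # hypothetical fitness ratio
    for q in range(4):
        k = 2 ** (q + 1) - 1  # generations 1, 3, 7, 15
        print(k, gamma_doubling(q, r), gamma_direct(k, r))
```

Each doubling step costs one multiplication, so the sum up to generation 15 needs three steps where the direct sum needs sixteen terms.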


For an infinite series summed with low fitness ratios, the series (5)
converges according to

$$\Gamma_t = \frac{1 - (1/r)^t}{1 - (1/r)} = \frac{1}{r^{t-1}} \, \frac{r^t - 1}{r - 1}, \qquad \lim_{t \to \infty} \frac{r^t - 1}{r - 1} = (1 - r)^{-1} \ \text{for } r < 1 \tag{18}$$

Thus, for infinite times and low fitness ratios, the population changes
according to the dynamical equation:

$$P_t = \frac{P_0 r^t}{1 + P_0 r^{t-1} (r - 1)\, \Gamma_t} = \frac{P_0 r^t}{1 - P_0} \tag{19}$$


Equation (19) can be solved for the computational complexity with the
population size, $n$:

$$t_c = \frac{\ln\left[P_f (1 - P_0) / P_0\right]}{\ln(r)} \tag{20}$$

Again, two cases are examined, the worst and average complexities,
given in (21) and (22).

The principal value of the partial summation formalism is to allow
intermediate stages of the growth cycle to be expressed in closed
form without resorting to any assumptions of long times or
constraints on the fitness ratio. The appeal is that programmed changes
in population (injection or withdrawal of strings) can be undertaken
without loss of analytic versatility. Additionally, the time complexity
for regions of rapid convergence (following such injection or
withdrawal) can be monitored without resetting the population balance
with a new fitness ratio. Finally, the appeal of a $2^q$ representation
space for population changes automatically suggests a simple
mapping onto the hypercube of available search space. The generation
steps thus naturally can be fitted to vertices of an ever enlarging
(hypercube) search space.

Not only does (27) give the partial summation in a closed
(polynomial) form, but it also allows for the immediate production of
interval summations, e.g. the growth that occurred between
generations 7 and 15 (or $p-1$ to $2p-1$). Rewriting Euler’s device in
terms of $q$ for powers of $p = 2^q$ gives (28) and (29).




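For equal fitness, $x = 1/r = 1$, each term of the series (5) equals
unity:

$$\sum_{n=0}^{t-1} x^n = 1 + x + x^2 + \cdots = t \tag{32}$$

$$\Gamma_t = 1 + 1 + 1 + \cdots + 1 = t \tag{33}$$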


Thus, in the limit of arbitrary times and equal fitness ratios, the
population remains unchanged according to the dynamical equation:

$$P_t = \frac{P_0 r^t}{1 + P_0\, t\, r^{t-1} (r - 1)} = P_0 \quad \text{for } r = 1 \tag{34}$$


A somewhat shorter derivation follows from observing that
$(r - 1) = 0$ in this case, thus for any finite number of terms in the
summation, the dependence on time should vanish entirely from the
denominator of (34).
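
The equal fitness limit is easy to confirm numerically. The sketch below evaluates the population with $\Gamma_t = t$ and, separately, iterates the recursion (1) at $r = 1$; the function names and the starting value $P_0 = 0.3$ are illustrative assumptions.

```python
def population_with_linear_sum(p0, r, t):
    """P_t = P_0 r^t / (1 + P_0 t r^(t-1) (r - 1)), i.e. Gamma_t = t."""
    return p0 * r ** t / (1.0 + p0 * t * r ** (t - 1) * (r - 1.0))


def iterate_equal_fitness(p0, t):
    """Apply Eq. (1) t times with f_1 = f_0 = 1, so that r = 1."""
    p = p0
    for _ in range(t):
        p = p / (p + (1.0 - p))  # S_t = P_t + (1 - P_t)
    return p


if __name__ == "__main__":
    print(population_with_linear_sum(0.3, 1.0, 25), iterate_equal_fitness(0.3, 25))
```

Both routes return the starting proportion, since the factor $(r - 1)$ removes the time dependence from the denominator.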

Summary of Results 


Fig. 1. Stepped population growth and series sum using Euler’s device
to obtain partial summations without constraining the fitness space or
simulation time. The bars highlight the points of partial summation on a
binary landscape ($t = 2p - 1$ where $p = 2^q$).

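For $q$:

$$\Gamma_{2^{q+1}-1} = \left(1 + \frac{1}{r^{2^q}}\right) \Gamma_{2^q - 1} \tag{28}$$

For $q - 1$:

$$\Gamma_{2^q - 1} = \left(1 + \frac{1}{r^{2^{q-1}}}\right) \Gamma_{2^{q-1} - 1} \tag{29}$$

Subtracting (29) from (28) and rewriting in terms of $p$ gives the
interval summation formula:

$$\Gamma_{2p-1} - \Gamma_{p-1} = \Gamma_{2p-1} - \left(1 + \frac{1}{r^{p/2}}\right) \Gamma_{p/2-1} \tag{30}$$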

For example, the population growth between generations 7 and 15
can be written in exact closed form as:

$$\Gamma_{15} - \Gamma_7 = \Gamma_{15} - (1 + 1/r^4)\,\Gamma_3 \tag{31}$$

As shown schematically in Fig. 2, a complete interval analysis
becomes possible for subsets of the partial summation. By breaking
the population growth curve into discrete intervals, a useful
formalism evolves for addressing intermediate changes in GA dynamics
(e.g. withdrawal or addition of strings, changes in fitness). Thus the
objective of tracking regions of accelerated convergence is simplified
without loss of mathematical generality.
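
The interval summation between generations 7 and 15 can be verified against the terms it is meant to collect. In the sketch below $\Gamma_k$ again runs from $n = 0$ to $n = k$, and $p$ is assumed to be a power of two with $p \ge 2$; the function names and the value $r = 2$ are hypothetical.

```python
def gamma(k, r):
    """Partial sum of (1/r)^n for n = 0..k."""
    return sum((1.0 / r) ** n for n in range(k + 1))


def interval_sum(p, r):
    """Gamma_{2p-1} - Gamma_{p-1}, with Gamma_{p-1} expanded by one
    doubling step as [1 + (1/r)^(p/2)] * Gamma_{p/2 - 1}."""
    half = p // 2
    return gamma(2 * p - 1, r) - (1.0 + (1.0 / r) ** half) * gamma(half - 1, r)


if __name__ == "__main__":
    r = 2.0  # hypothetical fitness ratio
    direct = sum((1.0 / r) ** n for n in range(8, 16))  # terms n = 8..15
    print(interval_sum(8, r), direct)
```

With $p = 8$ the function reproduces $\Gamma_{15} - \Gamma_7$, the contribution of generations 8 through 15 alone.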



Fig. 2. Schematic of interval summations 
within the partial sums of Euler’s device. 


The Case of Arbitrary Times ($t \to \infty$) and Fitness Ratios $r = 1$

For completeness' sake, consider the trivial case of equal fitness,
$r = 1$. When the two solutions have indistinguishable performance, the
population should remain fixed on average at the initial value, $P_0$.
As a check on the previous formalism, such a result does indeed follow
when the series sums for equal fitness: for $(1/r) = 1$ the series (5)
sums according to (32) and (33).


A summary of the various approximations is shown in Fig. 3.
The graphical difference between exact and approximate population
dynamics is highlighted in Fig. 4. The time complexities calculated in
the various approximate limits are compared in Fig. 5. In general the
weak (logarithmic) dependence of the GA processing on problem size
gives it great potential compared to other sorting and search routines
(Fig. 6).

Fig. 3. Phase diagram for series summation in approximate
limits of short and long times and arbitrary fitness values.


Much work has been published on the compromises struck
between rapid GA convergence and processing parameters (e.g.
reference 6). A preliminary discussion here presents the fundamental
issue in terms of accelerated regions of GA convergence. In general,
GA populations move toward the optimum following an S-curve or
logistic growth. The central steep ascendancy of the S-curve
corresponds to highly efficient GA processing. In this regime, genetic
mixing improves overall performance, such that neither good strings
get heavily disrupted (strong crossover) nor do bad strings get
unnecessarily preserved (low selection). Thus GA efficiency can be
matched, in genetic terms, to a balanced tradeoff struck between
diversity and selectivity. Over time, this tradeoff can be imagined
schematically to combine two composite operators. For the same operations
mapped into time and fitness space, neighborhoods of high diversity
correspond to low fitness ratios, while neighborhoods of strong
selectivity correspond to high fitness ratios. In this translation,
regimes of accelerated convergence can be identified directly with
their approximate time complexities (Fig. 3).



Fig. 4. Summary of low and high fitness approximations and their 
effects on population growth. Lower right shows the comparison 
between exact and approximate solutions. Note that the exact and high 
fitness regimes essentially overlap. 



Fig. 5. Comparative time complexities (takeover times) for the 
approximate limits. Short times correspond to regions of accelerated 
convergence. 



Fig. 6. Comparative computational complexity between various sort 
routines. 

To summarize, the work has analyzed the population dynamics of
a general GA in fitness space. The iterated population yields recursion
relations, and a convenient formalism is developed for the approximate
cases of long and short times as well as high and low fitness
values. The validity of these closed forms is compared to the exact
result. By using the elegant tool of partial summation (Euler’s device),
the analytical basis of the GA is prepared to handle discontinuous
changes in parameters without loss of mathematical generality.
For example, the introduction or removal of strings can be safely
accounted for by using the discrete summation formalism in a
particularly transparent way. An additional advantage in using the
partial series summation (Fig. 1) is to relax the requirements for long
(infinite) time series. In practice, most GAs run no longer than a few
decades of generational time steps. Future work will examine the
potential for abrupt population changes or more dynamic fitness
landscapes (time dependent optimizations) in a focused effort to
understand GA processing and convergence.